Join post-HNSW block rescore for diversifying child KNN (POC) #16047
iprithv wants to merge 2 commits
Optional blockRescore on DiversifyingChildren float/byte KNN; shared blockRescore() with visited accounting; tests; JMH benchmark; CHANGES (Improvements, GITHUB#15839). Relates to apache#15839
@benwtrent, I would like to get your thoughts on this. Thanks!
This doesn't address that issue. This is a better and more complete solution that better shows the idea: #16034
@benwtrent Thanks for the pointer to #16034. Agreed, it's closer to what the issue describes: sibling scoring happens inline during HNSW traversal and can affect early termination, while this POC only rescores parents after HNSW finishes (so it can fix parent score correctness but doesn't change traversal). A couple of clarifying questions before I close this out:
Thanks!
Description
This implements a proof of concept for sibling scoring in block join diversified child vector search, discussed in #15839 ("Maybe Improve join block Vector search performance by block scoring child vectors").
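To make the idea concrete, here is a minimal, self-contained sketch of a post-HNSW block rescore. It is not the code from this PR: `BlockRescoreSketch`, `ParentHit`, `Rescored`, and this `blockRescore` signature are hypothetical names, `java.util.BitSet` stands in for Lucene's parent bit set, and a plain dot product stands in for the configured vector similarity. It assumes the usual block-join layout, where children are indexed immediately before their parent.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

public class BlockRescoreSketch {
  /** A parent hit as returned by the approximate (HNSW) search. Hypothetical type. */
  record ParentHit(int parentDoc, float score) {}

  /** Rescored hits plus the number of child vectors scored ("visited"). Hypothetical type. */
  record Rescored(List<ParentHit> hits, int visited) {}

  /** Dot product, used here as a stand-in similarity function. */
  static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /**
   * For each parent hit, score every child in its block and keep the best score.
   * vectors is indexed by doc id; entries for parents (or children without a vector) are null.
   */
  static Rescored blockRescore(
      List<ParentHit> hits, BitSet parents, float[][] vectors, float[] query) {
    List<ParentHit> out = new ArrayList<>(hits.size());
    int visited = 0;
    for (ParentHit hit : hits) {
      // Children occupy the doc ids between the previous parent and this parent.
      int firstChild = parents.prevSetBit(hit.parentDoc() - 1) + 1;
      float best = hit.score();
      for (int child = firstChild; child < hit.parentDoc(); child++) {
        if (vectors[child] == null) {
          continue;
        }
        best = Math.max(best, dot(query, vectors[child]));
        visited++; // count every child vector that was actually scored
      }
      out.add(new ParentHit(hit.parentDoc(), best));
    }
    // Re-sort by the corrected scores, best first.
    out.sort(Comparator.comparingDouble((ParentHit h) -> h.score()).reversed());
    return new Rescored(out, visited);
  }
}
```

Because this runs only after the graph search has returned, it can correct a parent's score when HNSW happened to reach a weaker sibling, but it cannot change which parents were collected or when traversal stopped.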
Benchmarks (JMH)
DiversifyingChildrenFloatKnnJoinBenchmark: 3 forks, 5 warmup / 10 measurement iterations (30 samples/cell), -Xmx2g, dim 96, topK 64, 4096 parent blocks.
Configurations compared: rescoreBlocks=false and rescoreBlocks=true.
Overhead grows with block width (and with topK).